Agenda for Chemometricians 



It is most appropriate that the proceedings of this conference are going to be dedicated to the 
memory of Jack Youden. He was interested in many of the topics that are being considered at 
this conference, for example, interlaboratory comparisons, calibration, analytical methods, and 
measurement errors, both systematic and random. He was indeed a pioneering chemometrician,
before the name existed. He was also interested in explaining to chemists, chemical
engineers, and others how they could benefit by using statistical methods. 

I'm sure Youden would have been pleased with this conference, which provides a forum for 
chemists, statisticians, and others interested in chemometrics to discuss research of mutual 
interest. He also might have observed that chemometrics as a field has reached a level of 
maturity that warrants consideration of questions related to spreading the word to others, to 
non-chemometricians, so that they could take advantage of the techniques that are now available.
In other words, perhaps chemometrics as a discipline has reached a sufficiently advanced
stage of research and development that questions of production should now be addressed. What
are our most useful products? Who are our customers? Which products would they find most
valuable? What are the obstacles that prevent these customers from using these products now? 
How can these obstacles be overcome? What are the most important things that can be done in 
the next three years to reach new customers? What should the agenda be for chemometricians 
in the next few years? 

There are two ways to learn. One is to listen, as in a lecture. The other is to engage in a 
dialogue, as in a conversation. The first way is passive. The second is active. Let's try the 
second way to learn from one another how we might answer these questions. 

[Participants at this point wrote out answers to these questions, discussed them, and voted on 
them. The top vote-getters for the most important things that can be done in the next three years 
to reach new customers were the following, listed in order of decreasing number of votes:

1. Organize joint conferences with chemists. 

2. Write textbooks on chemometrics. 

3. Conduct workshops and teach short courses. 

4. Write user-friendly software. 

5. Teach chemometrics to graduate students. 

6. Write tutorial, expository, and review articles.

7. Undertake joint research projects with chemists. 

8. Publicize success stories. 

9. Teach chemometrics to undergraduate students. 

10. Communicate with management. 

11. Hire professionals to help with a public relations effort. 

12. Teach chemometrics to high school students.] 

I recommend that we take action on the basis of this list. Let me now make a few observations 
in closing. I would like to suggest a different starting point for statistics courses. Let us represent 
the relationship between an observed response y and variables $x_1, x_2, \ldots$ as

$$y = f(x_1, x_2, x_3, x_4, x_5, \ldots, x_{126}, \ldots).$$

Many, many, many variables affect y. It is the fluctuation of these variables that gives us 
different answers when we repeat an experiment two or more times under "identical conditions."
We are often interested in creating a mathematical equation (model) that involves a subset of the
variables. For purposes of illustration, suppose this subset is $(x_1, x_2)$. We can then write

$$y = f(x_1, x_2) + g(x_1, x_2, x_3, x_4, x_5, \ldots, x_{126}, \ldots).$$

Note that the g function includes $x_1$ and $x_2$ (because of lack of fit of the model) as well as all
the other x's. Lack of fit occurs, for example, because the model f may be taken to be linear
in $x_1$ and $x_2$ but the actual relationship may be nonlinear in $x_1$ and $x_2$. The function g is most often
called experimental error, and it is almost as often endowed by writers with an abundance of 
desirable and well-known properties. They call it a random variable. A sequence of these 
experimental errors, they frequently say, can be assumed to be independent, identically distributed
according to a Normal distribution with a zero mean and constant variance. I believe
that statisticians too readily make this assumption and others like it. Sometimes such an assumption
makes sense, sometimes not. We should be more careful on this point.
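
The point about lack of fit can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the "true" system (curved in x1, with a small contribution from one unmodelled variable x3), the coefficients, and the helper `fit_line` are all hypothetical. A straight line in x1 is fitted, and the leftover g is then inspected at the ends and in the middle of the x1 range.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(1)
x1 = [i / 10 for i in range(-20, 21)]   # modelled variable, from -2 to 2
x3 = [rng.gauss(0, 1) for _ in x1]      # hypothetical unmodelled variable

# Hypothetical true relationship: nonlinear in x1, small effect of x3.
y = [1.0 + 2.0 * a + 1.5 * a * a + 0.2 * c for a, c in zip(x1, x3)]

# The model f is taken to be linear in x1; g is everything left over.
intercept, slope = fit_line(x1, y)
g = [yi - (intercept + slope * a) for a, yi in zip(x1, y)]

mean_g = sum(g) / len(g)
ends = [gi for a, gi in zip(x1, g) if abs(a) > 1.5]
middle = [gi for a, gi in zip(x1, g) if abs(a) < 0.5]
ends_mean = sum(ends) / len(ends)
middle_mean = sum(middle) / len(middle)

print("mean of g:           ", round(mean_g, 6))
print("mean of g at the ends:", round(ends_mean, 2))
print("mean of g in middle:  ", round(middle_mean, 2))
```

In this sketch g averages to zero overall, as least squares guarantees, yet it is systematically positive at the ends of the range and negative in the middle. A sequence like that is not identically distributed around zero, so the usual assumptions would not make sense here.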

An adequate model is a function that will turn data into white noise, as George Box has said. 
An analogy that I find useful involves a process for separating gold particles from a slurry. If 
the process is fully efficient, the waste stream will contain no gold. It is therefore prudent to 
check the waste stream to see if it contains any gold. Likewise in creating and fitting models, 
it makes sense to examine residuals to see if they contain any information. The data contain 
information (that's the gold we want to get), and a good model will extract all the information 
in those data. Hence the residuals will be manifestations of white noise, an informationless 
sequence of values. 
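
One simple way to check the waste stream is to look at the residuals in run order and compute their lag-1 autocorrelation, which should be near zero for white noise. The following sketch assumes a hypothetical process in which an unmodelled variable drifts slowly during the runs; the data, the coefficients, and the helper names are illustrative only, and lag-1 autocorrelation is just one of many possible residual checks.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def residuals(xs, ys):
    """Residuals from the fitted straight line, kept in run order."""
    a, b = fit_line(xs, ys)
    return [yi - (a + b * xi) for xi, yi in zip(xs, ys)]


def lag1_autocorr(r):
    """Lag-1 sample autocorrelation of a sequence."""
    n = len(r)
    m = sum(r) / n
    num = sum((r[i] - m) * (r[i + 1] - m) for i in range(n - 1))
    den = sum((v - m) ** 2 for v in r)
    return num / den


rng = random.Random(2)
n = 200
x = [rng.uniform(0, 10) for _ in range(n)]     # settings in randomized run order
noise = [rng.gauss(0, 0.3) for _ in range(n)]
drift = [0.02 * t for t in range(n)]           # hypothetical slow drift over the runs

y_clean = [3.0 + 0.5 * xi + e for xi, e in zip(x, noise)]
y_drift = [yc + d for yc, d in zip(y_clean, drift)]

r_clean = lag1_autocorr(residuals(x, y_clean))
r_drift = lag1_autocorr(residuals(x, y_drift))

print("lag-1 autocorrelation, adequate model:", round(r_clean, 2))
print("lag-1 autocorrelation, drift ignored: ", round(r_drift, 2))
```

When the straight line is adequate, the residual autocorrelation is small. When the drift is left out of the model, successive residuals resemble one another strongly, which signals that information is still sitting in the waste stream.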

Chemists and chemical engineers could benefit from knowing more about variance components,
statistical graphics, and quality control techniques (including Shewhart and cumulative
sum charts). But, above all, I think they would find statistical experimental designs to be the 
most useful thing of all that chemometricians have to offer. Such designs provide a practical 
means for increasing research efficiency, which might be defined as the amount of information 
one obtains per dollar spent. 

The damage done by poor experimental design is irreparable. A poor design results in data 
that contain little information. Consequently, no matter how thorough, how clever, or how 
sophisticated the subsequent analysis is, little information can be extracted. A good design, for 
the same expenditure of time, money, and other resources, results in data rich in information. 
A fruitful analysis is then possible. (Note that analysis is defined as trying to extract all the useful 
information in the data.) 

Two-level factorial and fractional factorial designs can be extremely useful for chemists, 
chemical engineers, and others who do similar work. One of the best ways for a student to learn 
about such designs is to set one up, get the data, analyze them, and interpret the results. For a 
number of years I have had students in our experimental design course undertake such projects. 
The main piece of advice I give them is to work on something they care about, something they
are really interested in. 

Toward the end of an introductory one-semester undergraduate course in statistics, for example,
one student said that he was a pilot and that, ever since he started to fly, he had asked
instructors and other pilots what he should do if the engine failed on takeoff. He had been told 
by several people that he should bank the plane, go into a 180° turn, and land on the runway 
from which he took off. Unfortunately, many different ways of doing this maneuver had been 
suggested. He successfully organized and executed a replicated $2^3$ factorial design with three
variables: bank angle, flap angle, and speed. He measured the loss in altitude. He started each 
test at 1000 feet instead of ground level. The experiment was a success. He learned which 
combination of factors he should use for his plane, and he discovered the minimum altitude for 
attempting such a maneuver. 
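
To show how data from such a replicated $2^3$ design might be analyzed, here is a minimal sketch. The altitude losses below are invented for illustration and are not the student's data; the coding of each factor as -1 (low) and +1 (high) and the function `effect` are likewise assumptions of this sketch. Each effect is estimated as the average response at the high level minus the average response at the low level.

```python
from itertools import product

factors = ("bank", "flap", "speed")

# The eight runs of a 2^3 design in coded units: -1 = low level, +1 = high level.
design = list(product((-1, 1), repeat=3))

# Hypothetical altitude losses in feet, two replicates per run,
# listed in the same order as `design`. Made up for illustration only.
altitude_loss = [
    (330, 320), (340, 330), (350, 340), (360, 350),
    (220, 210), (290, 280), (240, 230), (310, 300),
]


def effect(signs, responses):
    """Average response where sign is +1 minus average where sign is -1."""
    hi = [y for s, ys in zip(signs, responses) for y in ys if s > 0]
    lo = [y for s, ys in zip(signs, responses) for y in ys if s < 0]
    return sum(hi) / len(hi) - sum(lo) / len(lo)


main_effects = {
    name: effect([run[i] for run in design], altitude_loss)
    for i, name in enumerate(factors)
}

# Two-factor interaction: use the product of the two coded columns as the sign.
bank_x_speed = effect([run[0] * run[2] for run in design], altitude_loss)

# Run with the smallest average altitude loss.
best_run, _ = min(zip(design, altitude_loss), key=lambda p: sum(p[1]) / len(p[1]))

print(main_effects)
print("bank x speed interaction:", bank_x_speed)
print("best combination (bank, flap, speed):", best_run)
```

With these invented numbers the bank angle has the largest effect, and the bank-by-speed interaction shows that the benefit of one factor depends on the level of the other, which is the kind of finding that changing one variable at a time would miss.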

Factorial designs can be understood and run with profit by graduate, undergraduate, senior 
high school, and junior high school students. Maybe younger students can use them, too. 
Students can study the baking of cakes, the riding of bicycles, the making of chemicals, the 
growing of plants, and the swinging of pendulums. Dalia Sredni, when she was a seventh grader, 
for instance, studied the effects of changing oven temperature, baking time, and the amount of 
baking soda when making a cake. Students should be told about factorial designs early so that 
they can study systems that depend on many variables and learn how they work. Using such 
designs they can discover interesting things, have fun, and be surprised. Our students deserve 
more of these pleasures. I have included a list of 101 experiments that have been done by 
students at Wisconsin, to indicate the variety of things that is possible. 
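
Setting up such a design takes very little machinery. The sketch below builds a run sheet for the three cake variables mentioned above, in randomized run order, and also shows a half fraction for the case where only four runs are affordable. The low and high levels are hypothetical, and the function names are illustrative; the half fraction uses one common generator, in which the last factor is set to the product of the others.

```python
import random
from itertools import product

# Factor names follow the cake study; the low/high levels are
# hypothetical and chosen only for illustration.
levels = {
    "oven temperature (F)": (325, 375),
    "baking time (min)": (25, 35),
    "baking soda (tsp)": (0.5, 1.0),
}


def full_factorial(k):
    """All 2**k runs in coded units (-1 = low, +1 = high)."""
    return [list(run) for run in product((-1, 1), repeat=k)]


def half_fraction(k):
    """2**(k-1) runs: a full factorial in the first k-1 factors, with the
    last factor set to the product of the others."""
    runs = []
    for base in full_factorial(k - 1):
        last = 1
        for sign in base:
            last *= sign
        runs.append(base + [last])
    return runs


def run_sheet(coded_runs, levels, seed=0):
    """Translate coded runs into actual settings, in randomized run order."""
    order = list(coded_runs)
    random.Random(seed).shuffle(order)
    names = list(levels)
    return [
        {name: levels[name][(sign + 1) // 2] for name, sign in zip(names, run)}
        for run in order
    ]


sheet = run_sheet(full_factorial(3), levels)
for i, settings in enumerate(sheet, start=1):
    print(i, settings)

print(half_fraction(3))
```

With three factors, the half fraction needs only four runs, at the price that the main effect of the third factor cannot be separated from the interaction of the first two. Randomizing the run order guards against drifting, unmodelled variables being mistaken for factor effects.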

I would like to end by congratulating the conference organizers for the excellent job they have 
done. It is clear that they have worked hard to make things enjoyable and rewarding for those
of us who have been fortunate enough to participate. 



William G. Hunter 

Professor of Statistics and Industrial Engineering 

Director of Center for Quality and Productivity Improvement 

University of Wisconsin — Madison 



Table 1. List of some studies done by students in an experimental design course at the University of Wisconsin — Madison. 
variables responses 



1. seat height (26, 30 inches), generator (off, on), tire pressure (40, 55 psi) 


time to complete fixed course on bicycle and pulse 
rate at finish 


2. brand of popcorn (ordinary, gourmet), size of batch (1/3, 2/3 cup), popcorn to oil
ratio (low, high) 


yield of popcorn 


3. amount of yeast, amount of sugar, liquid (milk, water), rise temperature, rise time 


quality of bread, especially the total rise 


4. number of pills, amount of cough syrup, use of vaporizer 


how well twins, who had colds, slept during the night 


5. speed of film, light (normal, diffused), shutter speed


quality of slides made close up with flash attachment 
on camera 


6. hours of illumination, water temperature, specific gravity of water 


growth rate of algae in salt water aquarium 


7. temperature, amount of sugar, food prior to drink (water, salted popcorn) 


taste of Koolaid 


8. direction in which radio is facing, antenna angle, antenna slant 


strength of radio signal from particular AM station in
Chicago 


9. blending speed, amount of water, temperature of water, soaking time before blending


blending time for soy beans 



10. charge time, digits fixed, number of calculations performed


operation time for pocket calculator 


11. clothes dryer (A, B), temperature setting, load


time until dryer stops 


12. pan (aluminum, iron), burner on stove, cover for pan (no, yes)


time to boil water 


13. aspirin buffered? (no, yes), dose, water temperature


hours of relief from migraine headache 


14. amount of milk powder added to milk, heating temperature, incubation temperature


taste comparison of homemade yogurt and commercial 
brand 


15. pack on back (no, yes), footwear (tennis shoes, boots), run (7, 14 flights of steps) 


time required to run up steps and heartbeat at top 


16. width to height ratio of sheet of balsa wood, slant angle, dihedral angle, weight 
added, thickness of wood 


length of flight of model airplane 


17. level of coffee in cup, devices (nothing, spoon placed across top of cup facing up), 
speed of walking 


how much coffee spilled while walking 


18. type of stitch, yarn gauge, needle size


cost of knitting scarf, dollars per square foot 


19. type of drink (beer, rum), number of drinks, rate of drinking, hours after last meal 


time to get steel ball through a maze 


20. size of order, time of day, sex of server 


cost of order of french fries, in cents per ounce 


21. brand of gasoline, driving speed, temperature 


gas mileage for car 


22. stamp (first class, air mail), zip code (used, not used), time of day when letter 
mailed 


number of days required for letter to be delivered to 
another city 


23. side of face (left, right), beard history (shaved once in two years, sideburns; shaved
over 600 times in two years, just below sideburns)


length of whiskers 3 days after shaving


24. eyes used (both, right), location of observer, distance 


number of times (out of 15) that correct gender of
passerby was determined by experimenter with poor 
eyesight wearing no glasses 


25. distance to target, guns (A, B), powders (C, D) 


number of shot that penetrated a one foot diameter 
circle on the target 


26. oven temperature, length of heating, amount of water 


height of cake 


27. strength of developer, temperature, degree of agitation 


density of photographic film 


28. brand of rubber band, size, temperature


length of rubber band before it broke 


29. viscosity of oil, type of pick-up shoes, number of teeth in gear


speed of H.O. scale slot racers 


30. type of tire, brand of gas, driver (A, B) 


time for car to cover one-quarter mile


31. temperature, stirring rate, amount of solvent 


time to dissolve table salt


32. amounts of cooking wine, oyster sauce, sesame oil 


taste of stewed chicken 


33. type of surface, object (slide rule, ruler, silver dollar), pushed? (no, yes)


angle necessary to make object slide 


34. ambient temperature, choke setting, number of charges 


number of kicks necessary to start motorcycle 


35. temperature, location in oven, biscuits covered while baking? (no, yes) 


time to bake biscuits


36. temperature of water, amount of grease, amount of water conditioner 


quantity of suds produced in kitchen blender 


37. person putting daughter to bed (mother, father), bed time, place (home, grandparents)


toys child chose to sleep with 


38. amount of light in room, type of music played, volume


correct answers on simple arithmetic test, time required
to complete test, words remembered (from list of 15)


39. amounts of added Turkish, Latakia, and Perique tobaccos 


bite, smoking characteristics, aroma, and taste of tobacco
mixture


40. temperature, humidity, rock salt 


time to melt ice 


41. number of cards dealt at one time, position of picker relative to the dealer


points in games of sheepshead, a card game 


42. marijuana (no, yes), tequila (no, yes), sauna (no, yes)


pleasure experienced in subsequent sexual intercourse 



43. amounts of flour, eggs, milk 


taste of pancakes, consensus of group of four living 
together 


44. brand of suntan lotion, altitude, skier 


time to get sunburned 


45. amount of sleep the night before, substantial exercise during the day? (no, yes), eat 
right before going to bed? (no, yes) 


soundness of sleep, average reading from 5 persons 


46. brand of tape deck used for playing music, bass level, treble level, synthesizer? (no, 
yes) 


clearness and quality of sound, and absence of noise 


47. type of filter paper, beverage to be filtered, volume of beverage


time to filter 


48. type of ski, temperature, type of wax 


time to go down ski slope 


49. ambient temperature for dough when rising, amount of vegetable oil, number of 
onions 


four quality characteristics of pizza 


50. amount of fertilizer, location of seeds (3x3 Latin square) 


time for seeds to germinate 


51. speed of kitchen blender, batch size of malt, blending time


quality of ground malt for brewing beer 


52. soft drink (A, B), container (can, bottle), sugar free? (no, yes)


taste of drink from paper cup 


53. child's weight (13, 22 pounds), spring tension (4, 8 cranks), swing orientation 
(level, tilted) 


number of swings and duration of these swings obtained
from an automatic infant swing


54. orientation of football, kick (ordinary, soccer style), steps taken before kick, shoe 
(soft, hard) 


distance football was kicked 


55. weight of bowling ball, spin, bowling lane (A, B) 


bowling pins knocked down 


56. distance from basket, type of shot, location on floor 


number of shots made (out of 10) with basketball 


57. temperature, position of glass when pouring soft drink, amount of sugar added 


amount of foam produced when pouring soft drink 
into glass 


58. brand of epoxy glue, ratio of hardener to resin, thickness of application, smoothness 
of surface, curing time 


strength of bond between two strips of aluminum 


59. amount of plant hormone, water (direct from tap, stood out for 24 hours), window 
in which plant was put 


root lengths of cuttings from purple passion vine after 
21 days 


60. amount of detergent (1/4, 1/2 cup), bleach (none, 1 cup), fabric softener (not used, 
used) 


ability to remove oil and grape juice stains 


61. skin thickness, water temperature, amount of salt 


time to cook Chinese meat dumpling 


62. appearance (with and without a crutch), location, time 


time to get a ride hitchhiking and number of cars that 
passed before getting a ride 


63. frequency of watering plants, use of plant food (no, yes), temperature of water 


growth rate of house plants 


64. plunger A up (slow, fast), plunger A down (slow, fast), plunger B up (slow, fast),
plunger B down (slow, fast)


reproducibility of automatic dilutor, optical density 
readings made with spectrophotometer 


65. temperature of gas chromatograph column, tube type (U, J), voltage 


size of unwanted droplet 


66. temperature, gas pressure, welding speed 


strength of polypropylene weld, manual operation 


67. concentration of lysozyme, pH, ionic strength, temperature 


rate of chemical reaction 


68. anhydrous barium peroxide powder, sulfur, charcoal dust


length of time fuse powder burned and the evenness 
of burning 


69. air velocity, air temperature, rice bed depth 


time to dry wild rice 


70. concentration of lactose crystal, crystal size, rate of agitation 


spreadability of caramel candy 


71. positions of coating chamber, distribution plate, and lower chamber 


number of particles caught in a fluidized bed collector 


72. proportional band, manual reset, regulator pressure 


sensitivity of a pneumatic valve control system for a 
heat exchanger 



73. chloride concentration, phase ratio, total amine concentration, amount of preservative
added



degree of separation of zinc from copper accomplished
by extraction



74. temperature, nitrate concentration, amount of preservative added 


measured nitrate concentration in sewage, comparison 
of three different methods 


75. solar radiation collector size, ratio of storage capacity to collector size, extent of 
short-term intermittency of radiation, average daily radiation on three successive
days 


efficiency of solar space-heating system, a computer 
simulation 


76. pH, dissolved oxygen content of water, temperature 


extent of corrosion of iron 


77. amount of sulfuric acid, time of shaking milk-acid mixture, time of final tempering 


measurement of butterfat content of milk 


78. mode (batch, time-sharing), job size, system utilization (low, high) 


time to complete job on computer 


79. flow rate of carrier gas, polarity of stationary liquid phase, temperature 


two different measures of efficiency of operation of 
gas chromatograph 


80. pH of assay buffer, incubation time, concentration of binder 


measured cortisol level in human blood plasma


81. aluminum, boron, cooling time 


extent of rock candy fracture of cast steel 


82. magnification, read out system (micrometer, electronic), stage light 


measurement of angle with photogrammetric instrument


83. riser height, mold hardness, carbon equivalent 


changes in height, width, and length dimensions of 
cast metal 


84. amperage, contact tube height, travel speed, edge preparation 


quality of weld made by submerged arc welding process


85. time, amount of magnesium oxide, amount of alloy 


recovery of material by steam distillation


86. pH, depth, time 


final moisture content of alfalfa protein 


87. deodorant, concentration of chemical, incubation time 


odor produced by material isolated from decaying manure,
after treatment


88. temperature variation, concentration of cupric sulfate, concentration of sulfuric acid


limiting currents on rotating disk electrode


89. air flow, diameter of bead, heat shield (no, yes) 


measured temperature of a heated plate 


90. voltage, warm-up procedure, bulb age 


sensitivity of microdensitometer


91. pressure, amount of ferric chloride added, amount of lime added 


efficiency of vacuum filtration of sludge 


92. longitudinal feed rate, transverse feed rate, depth of cut 


longitudinal and thrust forces for surface grinding operation


93. time between preparation of sample and refluxing, reflux time, time between end of 
reflux and start of titrating 


chemical oxygen demand of samples with same 
amount of waste (acetanilide) 


94. speed of rotation, thrust load, method of lubrication 


torque of taper roller bearings 


95. type of activated carbon, amount of carbon, pH 


adsorption characteristics of activated carbon used 
with municipal waste water 


96. amounts of nickel, manganese, carbon 


impact strength of steel alloy 


97. form (broth, gravy), added broth (no, yes), added fat (no, yes), type of meat (lamb, 
beef) 


percentage of panelists correctly identifying which 
samples were lamb 


98. well (A, B), depth of probe, method of analysis (peak height, planimeter)


methane concentration in completed sanitary landfill 


99. paste (A, B), preparation of skin (no, yes), site (sternum, forearm) 


electrocardiogram reading 


100. lime dosage, time of flocculation, mixing speed 


removal of turbidity and hardness from water 


101. temperature difference between surface and bottom waters, thickness of surface
layer, jet distance to thermocline, velocity of jet, temperature difference between jet
and bottom waters


mixing time for an initially thermally stratified tank of 
water 